Knowledge Discovery in Variant Databases Using Inductive Logic Programming
Abstract
Understanding the effects of genetic variation on the phenotype of an individual is a major goal of biomedical research, especially for the development of diagnostics and effective therapeutic solutions. In this work, we describe the use of a recent knowledge discovery in databases (KDD) approach based on inductive logic programming (ILP) to automatically extract knowledge about human monogenic diseases. We extracted background knowledge from MSV3d, a database of all human missense variants mapped to 3D protein structures. In this study, we identified 8,117 mutations in 805 proteins with known three-dimensional structures that were known to be involved in human monogenic disease. Our results improve our understanding of the relationships between structural, functional, or evolutionary features and deleterious mutations. The inferred rules can also be applied to predict the impact of any single amino acid replacement on the function of a protein. The interpretable rules are available at http://decrypthon.igbmc.fr/kd4v/.
Related articles
Knowledge Discovery in Databases - An Inductive Logic Programming Approach
The need for learning from databases has increased along with their number and size. The new field of Knowledge Discovery in Databases (KDD) develops methods that discover relevant knowledge in very large databases. Machine learning, statistics, and database methodology contribute to this exciting field. In this paper, the discovery of knowledge in the form of Horn clauses is described. A case stud...
Knowledge Discovery from Structured Mammography Reports Using Inductive Logic Programming
The development of large mammography databases provides an opportunity for knowledge discovery and data mining techniques to recognize patterns not previously appreciated. Using a database from a breast imaging practice containing patient risk factors, imaging findings, and biopsy results, we tested whether inductive logic programming (ILP) could discover interesting hypotheses that could subse...
A Logical Framework for Frequent Pattern Discovery in Spatial Data
In recent times, several extensions of data mining methods and techniques have been explored aiming at dealing with advanced databases. Many promising applications of inductive logic programming (ILP) to knowledge discovery in databases have also emerged in order to benefit from semantics and inference rules of first-order logic. In this paper, an ILP framework for frequent pattern discovery in spat...
The Parallelization of a Knowledge Discovery System with Hypergraph Representation
Knowledge discovery is a time-consuming and space-intensive endeavor. By distributing such an endeavor, we can diminish both time and space. System INDED (pronounced "indeed") is an inductive implementation that performs rule discovery using the techniques of inductive logic programming and accumulates and handles knowledge using a deductive nonmonotonic reasoning engine. We present four schemes...
SPADA: A Spatial Association Discovery System*
This paper presents a spatial association discovery system, named SPADA, which has been developed according to the theoretical framework of inductive databases. Our approach considers inductive databases as deductive databases with an integrated inductive component and relies on techniques borrowed from the field of Inductive Logic Programming (ILP). In SPADA, an ILP module supports the process...
Journal title:
Volume 7, Issue
Pages -
Publication date 2013